53 research outputs found

    Parallel Graph Connectivity in Log Diameter Rounds

    Full text link
    We study graph connectivity problem in MPC model. On an undirected graph with nn nodes and mm edges, O(logn)O(\log n) round connectivity algorithms have been known for over 35 years. However, no algorithms with better complexity bounds were known. In this work, we give fully scalable, faster algorithms for the connectivity problem, by parameterizing the time complexity as a function of the diameter of the graph. Our main result is a O(logDloglogm/nn)O(\log D \log\log_{m/n} n) time connectivity algorithm for diameter-DD graphs, using Θ(m)\Theta(m) total memory. If our algorithm can use more memory, it can terminate in fewer rounds, and there is no lower bound on the memory per processor. We extend our results to related graph problems such as spanning forest, finding a DFS sequence, exact/approximate minimum spanning forest, and bottleneck spanning forest. We also show that achieving similar bounds for reachability in directed graphs would imply faster boolean matrix multiplication algorithms. We introduce several new algorithmic ideas. We describe a general technique called double exponential speed problem size reduction which roughly means that if we can use total memory NN to reduce a problem from size nn to n/kn/k, for k=(N/n)Θ(1)k=(N/n)^{\Theta(1)} in one phase, then we can solve the problem in O(loglogN/nn)O(\log\log_{N/n} n) phases. In order to achieve this fast reduction for graph connectivity, we use a multistep algorithm. One key step is a carefully constructed truncated broadcasting scheme where each node broadcasts neighbor sets to its neighbors in a way that limits the size of the resulting neighbor sets. Another key step is random leader contraction, where we choose a smaller set of leaders than many previous works do

    Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity

    Get PDF
    Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures well coarse-grained computation on large data - data is distributed to processors, each of which has a sublinear (in the input data) amount of memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model. In this paper, we study two fundamental problems, 2-edge connectivity and 2-vertex connectivity (biconnectivity). PRAM algorithms which run in O(log n) time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an n-vertex, m-edge graph of diameter D and bi-diameter D\u27, 1) a O(log D log log_{m/n} n) parallel time 2-edge connectivity algorithm, 2) a O(log D log^2 log_{m/n}n+log D\u27log log_{m/n}n) parallel time biconnectivity algorithm, where the bi-diameter D\u27 is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be O(n^{delta}) for arbitrary constant delta>0, and the total memory used is linear in the problem size. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [Andoni et al., 2018]. We also show an Omega(log D\u27) conditional lower bound for the biconnectivity problem
    corecore